Draft genomes of halophilic Archaea strains isolated from brines of the Carpathian Foreland, Poland

Halophilic Archaea are a unique group of microorganisms living in saline environments. They constitute a complex group whose biodiversity has not been thoroughly studied. Here, we report three draft genomes of halophilic Archaea isolated from brines, representing the genera Halorubrum, Halopenitus, and Haloarcula. Two of these strains, Boch-26 and POP-27, were identified as members of the genera Halorubrum and Halopenitus, respectively. However, they could not be assigned to any known species because their genome sequences differed too much from all other described genomes. In contrast, the third strain, Boch-4, was identified as Haloarcula hispanica. Genome lengths of these isolates ranged from 2.7 Mbp to 3.0 Mbp, and GC content was in the 63.77%-68.77% range. Functional analysis revealed biosynthetic gene clusters (BGCs) related to terpene production in all analysed genomes and one BGC for RRE (RiPP recognition element)-dependent RiPP (ribosomally synthesized and post-translationally modified peptide) biosynthesis. The results also extend the knowledge of the biodiversity of salt mine microbiota, an environment that remains poorly explored.


Introduction
Archaea, as representatives of an extremely diverse group of organisms, are present in almost all types of environments worldwide. They can be found in the guts of humans and animals, in harsh habitats such as salt mines or hot springs, and in food products [1][2][3]. For many years, research on Archaea was constrained by the limited availability of methods for studying these microorganisms. However, in the last decade, significant progress has been made in understanding the taxonomic and metabolic diversity of this group of organisms [4]. This progress is associated with the development of research methods based on high-throughput DNA sequencing and metagenome analyses, which provided greater insight into the microbial "dark matter" [5]. Metagenomic analysis of various environments has allowed the identification of new taxonomic groups among Archaea, such as Asgard archaea or DPANN [6,7]. Research is currently underway to develop Archaea cultivation methods, which remains a significant challenge; for many taxa, pure cultures have still not been isolated [8].
Archaea constitute a significant part of the extremophilic microorganisms, which are adapted to survive in conditions inaccessible to most known microbes [4]. This group also includes halophilic microorganisms living in high-salinity environments. Most of them can be found in the classes Halobacteria and Methanomicrobia belonging to the phylum Euryarchaeota [9]. Due to their ability to live in high-salinity conditions, these microorganisms can be used for industrial purposes in bioprocesses [10,11]. Bioproducts such as ectoine, polyhydroxyalkanoate (PHA) or extremozymes produced by halophiles have found applications in many areas of biotechnology, including the production of polyunsaturated fatty acids, biopolymers and osmoprotectants [11,12]. Research has also been conducted to identify and characterize compounds that may find application in medicine, as was the case with Actinomycin C2, Streptomonomicin or halocins [13].
The microbiota of salt mines and brines is still not well explored. Only a few strains of microorganisms have been isolated from this kind of habitat, and only a few metagenomes have been sequenced. Species such as Halorhabdus rudnickae, Halorubrum trueperi and Halorubrum amylolyticum were identified for the first time in samples collected from such environments [14][15][16]. Metagenomic analysis of a salt mine has so far been performed for the Karak Salt Mine, Pakistan [2]. Despite the research carried out, there are still significant gaps in the knowledge about halophiles inhabiting salt mines and brines, which calls for a thorough study of such environments to better understand their biodiversity and to discover possible applications of extremophiles.
The salt mine in Bochnia (southern Poland) is an example of an environment that, due to its uniqueness, can be a habitat for many previously unknown microbial strains and that has not yet been thoroughly investigated. The mine was established in the 13th century, in a fragment of the marine sediments of the Miocene salt-bearing formation, and was actively exploited until 1990, which makes it one of the oldest and longest-exploited salt mines in Europe. Due to its long history and unique character, it was placed on the UNESCO World Heritage List [17]. Currently, the historic mine consists of nine post-mining galleries reaching 350 m below the surface, and a significant part of the excavations is open to the public [18]. Since the end of salt mining, the activity of the Bochnia salt mine has changed its character and focused on tourism, recreation, and health protection.
In this paper, we report three genomes of Archaea isolated from brines collected from the Bochnia Salt Mine, located in southern Poland, near the city of Kraków. The analysed strains were characterized in terms of taxonomy and functionality, thus enriching the knowledge of the microbiota of the Bochnia Salt Mine.

Materials and Methods
Three strains of halophilic Archaea were isolated from brines collected in the Bochnia Salt Mine (49°58′09″N 20°25′03″E). Isolation of strains was carried out in accordance with the methodology used in earlier studies by Albuquerque et al. [15]. All strains were cultured on plates containing halobacteria medium (DSMZ 372) with 25% (w/v) NaCl concentration. The culture temperature was 37 °C for the strains Halorubrum Boch-26 and Halopenitus POP-27, and 28 °C for the strain Haloarcula hispanica Boch-4. Genomic DNAs were extracted using the QIAamp DNA Mini Kit (Qiagen, Hilden, Germany).
All sequencing data are publicly available from the National Institutes of Health under WGS accessions JAQYWM000000000, JARANU000000000, JARANT000000000.

Results and Discussion
The three draft genome sequences of the analysed strains contained between 33 and 177 contigs. Their sizes ranged from 2.7 to 3.0 Mbp. Genome completeness, assessed using CheckM, was between 95.97% and 97.93%. Genome contamination was estimated at 0.0%-0.95%, and the GC content was in the 63.77%-68.77% range. General genome feature statistics are summarized in Table 1.
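The assembly statistics reported above (number of contigs, total length, GC content) can be recomputed directly from a FASTA file. The following is a minimal sketch and not the pipeline used in this study: the function names `parse_fasta` and `assembly_stats` are hypothetical, the toy contigs are invented for illustration, and GC content is assumed here to be computed over unambiguous A/C/G/T bases only.

```python
from typing import Dict, Iterable, List, Tuple


def parse_fasta(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Parse FASTA-formatted lines into (contig_name, sequence) pairs."""
    contigs: List[Tuple[str, str]] = []
    name = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                contigs.append((name, "".join(chunks)))
            # Keep only the first token of the header as the contig name.
            name = (line[1:].split() or [""])[0]
            chunks = []
        elif name is not None:
            chunks.append(line.upper())
    if name is not None:
        contigs.append((name, "".join(chunks)))
    return contigs


def assembly_stats(contigs: List[Tuple[str, str]]) -> Dict[str, float]:
    """Return contig count, total length (bp) and GC content (%)."""
    seqs = [seq for _, seq in contigs]
    total = sum(len(s) for s in seqs)
    gc = sum(s.count("G") + s.count("C") for s in seqs)
    # Assumption: ambiguous bases (e.g. N) are excluded from the denominator.
    acgt = sum(s.count(base) for s in seqs for base in "ACGT")
    return {
        "contigs": len(contigs),
        "length_bp": total,
        "gc_percent": 100.0 * gc / acgt if acgt else 0.0,
    }


if __name__ == "__main__":
    # Invented toy data, not taken from the reported genomes.
    toy = [">c1 toy contig", "GGCC", "AT", ">c2", "ggccNN"]
    print(assembly_stats(parse_fasta(toy)))
```

For a real assembly, the lines of an opened FASTA file can be passed to `parse_fasta` in place of the toy list.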
Based on the genomic sequences, a preliminary taxonomic annotation was performed using the gtdb-tk tool. In this analysis, strains Boch-4 and POP-27 were assigned to Haloarcula hispanica and Halopenitus malekzadehii, respectively. For strain Boch-26, assignment to a species was impossible because the genome sequence of this strain differed too much from those of the other known genomes collected in the GTDB (the Genome Taxonomy Database). In order to better describe the evolutionary relationships between the strains analysed and other closely related species, a phylogenetic tree based on sequences of the gyrB gene was prepared. The reconstructed phylogenetic tree partially confirmed the taxonomic classification performed with gtdb-tk (Figure 1). Strain Boch-26 was placed close to other Halorubrum species but did not form a single cluster with any of them. Strain Boch-4, in turn, was placed in a single cluster with Haloarcula hispanica, in agreement with gtdb-tk. Strain POP-27 clustered together with Halopenitus persicus, but the sequence of the gyrB gene for Halopenitus malekzadehii was unavailable. To verify the results obtained from gtdb-tk and gyrB analysis, a phylogenetic tree based on the whole-genome sequences was prepared using the Genome BLAST Distance Phylogeny approach (Figure 2). The location of strains Boch-4 and Boch-26 on the reconstructed phylogenetic tree confirmed their taxonomic annotation. Strain Boch-4 clustered with Haloarcula hispanica, and the digital DNA-DNA hybridization (dDDH) value between these two genomes was 82.6%. Therefore, all analyses performed clearly indicated that strain Boch-4 belongs to the species Haloarcula hispanica. The classification of strain Boch-26 is also unequivocal at the genus level. On the whole-genome sequence-based phylogenetic tree, this strain was placed among other Halorubrum species, but its distance from its closest relative, Halorubrum depositum, was too great to identify both strains as the same species. This was also confirmed by the low dDDH value between these two strains, which was 35.45%. The results for the POP-27 strain remain inconclusive. On the phylogenetic tree, this strain was located close to Halopenitus malekzadehii, but the dDDH value between these two genomes was 59.2%, which is below the 70% threshold commonly used for species delineation and indicates that the genomes do not belong to the same species. Because the classifications based on gtdb-tk and dDDH disagree, it is impossible to conclude unequivocally from the available data whether the POP-27 strain belongs to the species Halopenitus malekzadehii. Accordingly, the strain will be referred to as Halopenitus POP-27.
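The dDDH-based reasoning in this paragraph can be made explicit with a small check. The sketch below is illustrative only: it assumes the commonly used 70% dDDH cut-off for species delineation (a convention, not a value stated in the text above), and the function name `same_species_by_dddh` is hypothetical. The three dDDH values are those reported in this section.

```python
# Assumption: 70% dDDH is the conventional species boundary.
DDDH_SPECIES_THRESHOLD = 70.0


def same_species_by_dddh(dddh_percent: float,
                         threshold: float = DDDH_SPECIES_THRESHOLD) -> bool:
    """Return True if a dDDH value (in %) meets the species-level cut-off."""
    if not 0.0 <= dddh_percent <= 100.0:
        raise ValueError("dDDH must be a percentage between 0 and 100")
    return dddh_percent >= threshold


# dDDH values reported above: (strain, closest relative) -> dDDH in %.
reported = {
    ("Boch-4", "Haloarcula hispanica"): 82.6,
    ("Boch-26", "Halorubrum depositum"): 35.45,
    ("POP-27", "Halopenitus malekzadehii"): 59.2,
}

if __name__ == "__main__":
    for (strain, relative), value in reported.items():
        verdict = "same species" if same_species_by_dddh(value) else "different species"
        print(f"{strain} vs {relative}: dDDH {value}% -> {verdict}")
```

A single threshold is a simplification; as the POP-27 case shows, dDDH can disagree with other classification methods, so the result is best read as one line of evidence rather than a final assignment.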
In order to characterise the functional profiles of the strains studied, a functional annotation of the genomes analysed was performed. The profiles obtained, based on eggNOG categories, are shown in Table 2. The Halopenitus strain POP-27 was characterised by a higher number of genes related to defence mechanisms, which amounted to 31 (1.27%). Building on these results, the BGCs present in the analysed genomes were identified using antiSMASH. Two BGCs associated with terpene production were identified in all three strains. These clusters were not identical between the strains; however, each one was organised around genes encoding phytoene synthase, an enzyme usually involved in carotenoid production. It can therefore be assumed that these strains may produce carotenoids. Three BGCs were identified in the Haloarcula hispanica strain. As previously mentioned, two of them were associated with terpene production, and the last one with RRE (RiPP recognition element)-dependent RiPP (ribosomally synthesized and post-translationally modified peptide) biosynthesis. The first terpene-related BGC contained 23 genes, and antiSMASH identified two strains (Haloarcula sp. CBA1115 and Haloarcula sp. K1K1) containing BGCs in which 100% of the genes showed similarity to genes in the BGC analysed. The second BGC consisted of 20 genes, and BGCs in which 95% of the genes showed similarity to genes in the analysed BGC were found in strains such as Haloarcula sp. K1K1, Haloarcula hispanica ATCC33960, and Haloarcula vallismortis DSM3756. The third BGC, associated with RiPP biosynthesis, contained 19 genes, and antiSMASH did not identify any highly similar BGCs. The most similar BGC, with 18% of similar genes, was associated with lasso peptide production and was identified on a plasmid of Halopiger xanaduensis SH-6. The BGC analysed contained genes related to the synthesis of the B1 and B2 proteins involved in lasso peptide production. A gene encoding a kinase that plays an important role in lasso peptide synthesis was also identified in this BGC.
In conclusion, the draft genomes of three Archaea strains have been reported and characterised in this paper. These genomes are valuable because the environment from which the strains were obtained is unique and isolated from other saline environments. Significant differences in genome sequences compared to other known halophiles provide insight into the diversity of microorganisms that can inhabit salt mines. The analyses conducted indicate that two of the strains analysed, Halorubrum Boch-26 and Halopenitus POP-27, may belong to hitherto unknown species. However, confirming this observation requires further analyses of the physiology and morphology of the isolates studied. The results obtained expand our knowledge of the biodiversity of halophilic Archaea living in salt mine environments.